Structured Document Retrieval, Multimedia Retrieval, and Entity Ranking Using PF/Tijah

Authors

  • Theodora Tsikrika
  • Pavel Serdyukov
  • Henning Rode
  • Thijs Westerveld
  • Robin Aly
  • Djoerd Hiemstra
  • Arjen P. de Vries

Abstract

CWI and University of Twente used PF/Tijah, a flexible XML retrieval system, to evaluate structured document retrieval, multimedia retrieval, and entity ranking tasks in the context of INEX 2007. For the retrieval of textual and multimedia elements in the Wikipedia data, we investigated various length priors and found that biasing towards longer elements than the ones retrieved by our language modelling approach can be useful. For retrieving images in isolation, we found that their associated text is a very good source of evidence in the Wikipedia collection. For the entity ranking task, we used random walks to model multi-step relevance propagation from the articles describing entities to all related entities and further, and obtained promising results.

Similar resources

Evaluating Structured Information Retrieval and Multimedia Retrieval Using PF/Tijah

We used a flexible XML retrieval system for evaluating structured document retrieval and multimedia retrieval tasks in the context of the INEX 2006 benchmarks. We investigated the differences between article and element retrieval for Wikipedia data as well as the influence of an element's context on its ranking. We found that article retrieval performed well on many tasks and that pinpointing th...

Efficient XML and Entity Retrieval with PF/Tijah: CWI and University of Twente at INEX'08

PF/Tijah is a research prototype created by the University of Twente and CWI Amsterdam with the goal to create a flexible environment for setting up search systems. By integrating the PathFinder (PF) XQuery system [1] with the Tijah XML information retrieval system [2] it combines database and information retrieval technology. The PF/Tijah system is part of the open source release of MonetDB/XQ...

Optimizing XML Information Retrieval Query Execution at the Physical Level

XML is emerging as a standard format for information interchange and storage of structured information. The wide-spread use of XML has sparked the interest of both the database and information retrieval research communities. XML databases are designed to store and query large volumes of XML data. Structured information retrieval or XML-IR is the application of information retrieval concepts and...

Managing structured queries in probabilistic XML retrieval systems

Focusing on the context of XML retrieval, in this paper we propose a general methodology for managing structured queries (involving both content and structure) within any given structured probabilistic information retrieval system which is able to compute posterior probabilities of relevance for structural components given a non-structured query (involving only query terms but not structural re...

Speech Retrieval Experiments using XML Information Retrieval

This report presents the University of Twente’s first cross-language speech retrieval experiments in the Cross-Language Evaluation Forum (CLEF). It describes the issues our contribution focused on, describes the PF/Tijah XML Information Retrieval system that was used, and discusses the results for both the monolingual English and the Dutch-English cross-language spoken document retrieval (...

Publication date: 2007